Abstract

Epidemic models are always simplifications of real world epidemics. Which real 
world features to include, and which simplifications to make, depend both on the 
disease of interest and on the purpose of the modelling. In the present paper we 
discuss some such purposes for which a stochastic model is preferable to a determin- 
istic counterpart. The two main examples illustrate the importance of allowing the 
infectious and latent periods to be random when focus lies on the probability of a 
large epidemic outbreak and/or on the initial speed, or growth rate, of the epidemic. 
A consequence of the latter is that estimation of the basic reproduction number $R_0$
is sensitive to assumptions about the distributions of the infectious and latent pe- 
riods when using the data from the early stages of an outbreak, which we illustrate 
with data from the SARS outbreak. Some further examples are also discussed as 
are some practical consequences related to these stochastic aspects.

1 Introduction

Mathematical epidemic models describe the spread of an infectious disease in a community 
(e.g. Bailey, 1975, Anderson and May, 1991, Diekmann and Heesterbeek, 2000). A model 
can be used to derive various properties of an outbreak, such as: whether or not a big 
outbreak may occur, how big the outbreak will be, or the endemic level in case the
disease becomes endemic. From a statistical/epidemiological point of view the model and
its analysis may be used to estimate important epidemiological parameters from observed
outbreak data. These estimates can then be used to study effects of potential interventions
to stop or reduce the spreading of the disease. For example, an endemic disease may go
extinct if a vaccination program is launched having high enough vaccination coverage (e.g.
Anderson and May, 1991, pp 87, and Gay, 2004), or an outbreak may be stopped during
its early stages if spreading parameters are reduced enough by means of
different sorts of intervention (e.g. Anderson et al., 2004, for an application to SARS).

Mathematical models are always simplifications of reality, but the hope is that the
simplifications have little effect on the epidemic properties of interest. Simple models have 
the advantage of being tractable to analysis and quite often allow for explicit solutions 
admitting general qualitative statements. Their main disadvantage is of course that they 
may be too simplistic for the conclusions to be valid also for real world epidemics. Adding 
more complexity to the model increases realism but usually makes it harder to analyse 
and also introduces more uncertainty by having more parameters. More complex models 
are usually analysed by means of numerical solutions to differential equations, or from 
numerous stochastic simulations. 

The most important features to include to make an epidemic model more realistic 
(and at the same time harder to analyse) are to incorporate individual heterogeneity (e.g. 
Anderson and May, 1991, pp 175) and/or structured mixing patterns (e.g. House and 
Keeling, 2008, for a deterministic household model). Another step in making a model more 
realistic is to make certain features random, for example the actual transmission/contact 
process but also possibly susceptibility, social structures, the latent period and/or the 
infectious period. Such stochastic models thus allow individuals to behave differently from
each other in a way that is specified by random distributions (e.g. Bailey, 1975, Andersson 
and Britton, 2000a). 

Which complexities to include in the model, and which not to, depend both on the type 
of disease in question and on the scientific question motivating the study. The aim of the 
present paper is to illustrate some aspects where stochasticity matters. More precisely 
we focus on two features, the risk for an outbreak and the initial growth rate of the 
epidemic, and we illustrate that they depend heavily on assumptions about the latent and 
infectious periods; not only on their mean durations but also on their randomness. As a 
consequence, the (stochastic) distributions of these periods are important when addressing
questions relating to these two features: using an over-simplified stochastic model or a
deterministic model will give misleading results. For example, estimating $R_0$ from the
initial phase of an epidemic is hard without additional knowledge about the distributions
of the infectious and latent periods, a fact which we illustrate using data from the SARS
outbreak. We illustrate our results using a simple epidemic model, but the qualitative 
conclusions hold also for more realistic models allowing other heterogeneities. We note 
that other features of the model, e.g. the basic reproduction number and the outbreak size 
in case of a major outbreak, hardly depend on the randomness of the latent and infectious 
periods at all, so having a deterministic latent and infectious period may be appropriate 
when addressing other questions. 

Most results presented in this paper are not new but have appeared elsewhere or
are "folklore" among stochastic epidemic modellers, but are perhaps less known outside
this community. The aim of the paper is hence to gather and present the results in a
simple form reaching outside the community of stochastic epidemic modellers. The rest
of the paper is outlined as follows. In Section 2 we present the standard stochastic SEIR
epidemic model for a homogeneously mixing community of homogeneous individuals. In
Section 3 properties of the model are presented and illustrated. In Section 4 we interpret
the results in more epidemiologically relevant formulations and illustrate where they can
make a difference. In the discussion we briefly describe, and give references to, some
other situations where stochasticity of some form affects certain features of the epidemic
model.

2 A simple stochastic epidemic model 
2.1 Definition 

We now define what we call the standard susceptible-exposed-infectious-removed (SEIR)
epidemic model. Consider a homogeneously mixing community consisting of $n$ homogeneous
individuals, where $n$ is assumed to be large. A transmittable disease is spread
according to the following rules. Initially a small number, $k$, of individuals are infectious
and the rest of the community is susceptible to the disease (immune individuals are
simply neglected). Each individual who gets infected is at first latent (exposed but not
yet infectious) for a random period $L$ with distribution $F_L$. After the latent period has
ended the infectious period starts and lasts for a period $I$ having distribution $F_I$. All
infectious periods and latent periods are assumed to be mutually independent. While
infectious an individual has random "infectious contacts" at rate $\lambda$. Each contact is with
a randomly chosen individual, so the contact rate with a specific individual is $\lambda/n$ (or
more correctly $\lambda/(n-1)$, but when $n$ is large this distinction is irrelevant). Contacts with
susceptible individuals result in infection (and their latent period starts); contacts with
non-susceptibles have no effect. Once the infectious period is over the individual is said
to be removed, meaning that the person has recovered and become immune, and plays
no further role in the epidemic. The epidemic goes on until there are no more infectious
or latent individuals, at which point it stops. Let $T$ denote the (random) number of
individuals who get infected during the outbreak, and who hence are removed at the end
of the epidemic. $T$ is often called the final size of the epidemic, and $\tilde{p} = T/n$ denotes the
final proportion infected during the outbreak.

In what follows we will restrict ourselves to the case where $L$ and $I$ have different and
independent Gamma distributions, this being a rather flexible family of distributions. We
parametrise these distributions by their means, $\mu_L = E(L)$ and $\mu_I = E(I)$ ($> 0$), and
their coefficients of variation $\tau_L = \sqrt{V(L)}/E(L)$ and $\tau_I = \sqrt{V(I)}/E(I)$ ($\ge 0$), where $V(\cdot)$
denotes the variance.
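The model defined above can be turned into a small simulation. The following is a minimal sketch and not the authors' code: the function name `simulate_final_size`, the event-queue implementation, the parameter values in the usage lines and the convention that $T$ includes the $k$ initial infectives are all assumptions made here for illustration.

```python
import heapq
import random


def simulate_final_size(n, k, lam, mu_L, tau_L, mu_I, tau_I, seed=None):
    """Simulate one outbreak of the SEIR model and return the final size T.

    Periods are Gamma distributed with the given mean (mu) and coefficient
    of variation (tau); tau = 0 gives a fixed (non-random) period.
    T counts everyone ever infected, including the k initial infectives
    (a convention assumed here).
    """
    rng = random.Random(seed)

    def draw_period(mu, tau):
        if tau == 0:
            return mu
        shape = 1.0 / tau**2          # Gamma shape from the coefficient of variation
        scale = mu * tau**2           # Gamma scale, so that shape * scale = mu
        return rng.gammavariate(shape, scale)

    susceptible = [True] * n
    contacts = []                     # heap of (contact time, contacted individual)

    def schedule_contacts(start, duration):
        # Infectious contacts form a Poisson process with rate lam on
        # (start, start + duration); each is with a uniformly chosen individual.
        t = start + rng.expovariate(lam)
        while t < start + duration:
            heapq.heappush(contacts, (t, rng.randrange(n)))
            t += rng.expovariate(lam)

    for i in range(k):                # initial infectives are infectious from time 0
        susceptible[i] = False
        schedule_contacts(0.0, draw_period(mu_I, tau_I))

    final_size = k
    while contacts:                   # process contacts in time order
        t, j = heapq.heappop(contacts)
        if susceptible[j]:            # contacts with non-susceptibles have no effect
            susceptible[j] = False
            final_size += 1
            infectious_from = t + draw_period(mu_L, tau_L)
            schedule_contacts(infectious_from, draw_period(mu_I, tau_I))
    return final_size


if __name__ == "__main__":
    # Hypothetical parameter values chosen so that lam * mu_I = 2.
    sizes = [simulate_final_size(1000, 1, 2 / 7, 7, 3 / 7, 7, 3 / 7, seed=s)
             for s in range(20)]
    print(sorted(sizes))
```

Repeating the simulation with different seeds typically produces a mixture of very small outbreaks and outbreaks involving a large share of the community, which is the dichotomy discussed in the following sections.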
2.2 The basic reproduction number $R_0$



Perhaps the most important property of an epidemic model is the basic reproduction
number, denoted $R_0$, which for the present model can be defined as the average number of
infections caused by a typical infective when the disease is introduced into the population.
For the present model it is easy to show that

$$R_0 = \lambda E(I) = \lambda \mu_I.$$

The basic reproduction number determines both whether a major outbreak is possible and, if
so, the final proportion infected in case there is a major outbreak. More precisely, it
can be shown that $\tilde{p}$, the ultimate proportion infected, will in a large community be close
to $p$, which solves

$$1 - p = e^{-R_0 p}. \qquad (2.1)$$

It is easy to see that $p = 0$ (corresponding to a minor outbreak) is always a solution
to (2.1). If $R_0 \le 1$ this is in fact the only solution, meaning that a major outbreak is
impossible. If $R_0 > 1$ there is also a unique strictly positive solution $p^*$ ($0 < p^* < 1$)
corresponding to a major outbreak.
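The final size equation (2.1) has no closed-form solution for $p^*$, but it is easy to solve numerically. The sketch below uses plain bisection; the function name `final_size_fraction` and the choice of bracket are assumptions made here, not something specified in the paper.

```python
import math


def final_size_fraction(r0, tol=1e-12):
    """Return the largest solution p in [0, 1) of 1 - p = exp(-r0 * p)."""
    if r0 <= 1.0:
        return 0.0                    # only the trivial solution p = 0 exists

    def f(p):
        return 1.0 - p - math.exp(-r0 * p)

    # f(1 - 1/r0) >= 0 because exp(r0 - 1) >= r0, and f(1) < 0,
    # so the positive root lies between these two points.
    lo, hi = 1.0 - 1.0 / r0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for r0 in (0.8, 1.5, 2.0, 3.0):
        print(r0, final_size_fraction(r0))
```

Bisection is used rather than a fixed-point iteration because it also behaves well when $R_0$ is only slightly above 1.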

As was seen above, $R_0$ only depends on the mean of the infectious period, not on its
randomness nor on the latency period. The model can be extended to allow for a (perhaps
random) time-varying infectivity $\lambda(s)$ over the infectious period ($0 \le s \le I < \infty$). Then
$R_0 = E\left(\int_0^I \lambda(s)\,ds\right)$, the expected accumulated infectivity. As before, $R_0$ determines both
whether a major outbreak is possible and, if so, how big the outbreak will be. In fact, the
complete (random) distribution of the final size, for any finite $n$, can be shown to depend
only on the distribution of the accumulated infectivity $\int_0^I \lambda(s)\,ds$ (Ball, 1986); how the
infectivity is distributed over time only affects the time dynamics of the epidemic and not
the final size.

3 Model properties affected by randomness 

In the previous section it was shown that $R_0$ only depends on the mean length of the
infectious period and not at all on the latent period. In the present section we study two 
features, the probability of a major outbreak and the initial growth rate of the epidemic, 
where the randomness of the infectious period and also the latent period do matter. In 
the discussion we briefly mention some other aspects where stochasticity matters. 

3.1 The probability of a major outbreak

When the community size $n$ is large, the initial phase of the epidemic may be approximated by
a branching process (Ball, 1986). The reason for this is that new contacts will most likely
be with not yet contacted people, so new infectives infect (= "give birth" in branching
process terminology) independently, which is the crucial underlying assumption in branching
processes. The branching process corresponding to our model is the Sevastyanov model
(Jagers, 1975, p 8). Infections correspond to births in the branching process, the latency
period to infancy in the branching process and the infectious period to the reproductive
life stage (life stages after the reproductive stage play no role for population growth, just
like removed individuals in the epidemic).

Let $\pi$ denote the probability of a large outbreak (corresponding to infinite growth
of the approximating branching process) when starting with $k = 1$ infectious individual.
From branching process theory it can be shown that $\pi$ is the largest solution to the balance
equation

$$1 - \pi = E\left[(1 - \pi)^X\right], \qquad (3.1)$$

where $X$ is the (random) number of births of a typical individual in the branching process.
The balance equation is obtained by conditioning on the number of births of the first
individual: if the first individual has $X = x$ births during her life, all these individuals
must avoid causing infinite growth. They do this independently, so the probability for
this to happen is $(1 - \pi)^x$.

For our model, with constant birth/infection rate $\lambda$ during a Gamma distributed
infectious period with mean $\mu_I$ and coefficient of variation $\tau_I$, the distribution of $X$, and
hence also $E[(1 - \pi)^X]$, can be computed explicitly. By first conditioning on the length
of the infectious period $I = y$ it is easy to show that $X$ then is Poisson distributed with
mean $\lambda y$, and removing the conditioning makes $X$ follow a negative binomial distribution.
Using this it can be shown that Equation (3.1) simplifies to

$$1 - \pi = \left(1 + R_0 \tau_I^2 \pi\right)^{-1/\tau_I^2} \qquad (3.2)$$

(this relation can also be found in Asikainen, 2006, p 28). If for example $\tau_I = 0$, implying
that the length of the infectious period is non-random, $\pi$ is the largest solution to
$1 - \pi = e^{-\pi R_0}$, which is obtained by taking limits of (3.2) when $\tau_I \to 0$. If $\tau_I = 1$,
corresponding to an exponentially distributed infectious period, we have that $\pi = 1 - 1/R_0$,
which is clearly different.
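The balance equation for a Gamma distributed infectious period can be solved numerically for any $R_0$ and $\tau_I$. The sketch below assumes the relation $1 - \pi = (1 + R_0 \tau_I^2 \pi)^{-1/\tau_I^2}$, which follows from the Poisson/Gamma mixture described above, and uses bisection; the function name `outbreak_probability` and the numerical tolerances are choices made here.

```python
import math


def outbreak_probability(r0, tau_I, tol=1e-12):
    """Largest solution pi in [0, 1] of 1 - pi = (1 + r0*tau_I^2*pi)^(-1/tau_I^2)."""
    if r0 <= 1.0:
        return 0.0                                # no major outbreak possible

    def prob_all_lines_die(pi):
        # E[(1 - pi)^X] when X is Poisson with a Gamma distributed mean.
        if tau_I == 0:
            return math.exp(-r0 * pi)             # limit tau_I -> 0 (fixed period)
        v = tau_I**2
        return math.exp(-math.log1p(r0 * v * pi) / v)

    def f(pi):
        return 1.0 - pi - prob_all_lines_die(pi)

    lo, hi = 1e-9, 1.0                            # f(lo) > 0 and f(hi) < 0 when r0 > 1
    if f(lo) <= 0:
        return 0.0                                # root is below the numerical resolution
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for r0, tau in [(3.0, 0.0), (1.5, 0.0), (3.0, 1.0)]:
        print(r0, tau, round(outbreak_probability(r0, tau), 3))
```

The three parameter pairs in the usage lines are the ones quoted in the text around Figure 1, so the printed values can be compared directly with the numbers given there.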

By studying the balance equation (3.2) it is possible to see how $\pi$, the probability
of a major outbreak, depends on model parameters. The first conclusion is not very
surprising: $\pi$ is increasing in $R_0 = \lambda\mu_I$ and hence also in the contact rate $\lambda$ and in
the mean infectious period $\mu_I$. A less obvious conclusion is that $\pi$ is decreasing in $\tau_I$
(the coefficient of variation of the infectious period). In other words, the more random
the length of the infectious period is, the less likely is a major outbreak. Finally, $\pi$ is
independent of $\mu_L$ and $\tau_L$.

In Figure 1 we have plotted $\pi$ as a function of $\tau_I$ in the range 0 (corresponding to a
deterministic infectious period) to 3 (being a very random infectious period), for three
choices of $R_0$. It is seen that $\tau_I$ is quite influential. For example, if $R_0 = 3$ and $\tau_I = 0$,
then $\pi \approx 0.940$. If $R_0$ is reduced to 1.5 and $\tau_I$ is unchanged we get $\pi \approx 0.583$, whereas
if we instead keep $R_0$ unchanged (at $R_0 = 3$) and increase $\tau_I$ to 1, then $\pi \approx 0.667$. It is
hence seen that the variation in the infectious period is as important as $R_0$ for determining
the probability of a major outbreak.

Figure 1: The probability of a large outbreak, $\pi$, as function of $\tau_I$, the coefficient of
variation of the infectious period.

The probability $\pi$ defined above was for the case that the epidemic starts with 1
initially infectious individual. More generally, we can define $\pi_k$ as the probability of a
major outbreak starting with $k$ initially infectious individuals (so $\pi_1 = \pi$). For an epidemic
not to take off, none of the initial infectives may initiate a major outbreak. As a consequence,
$\pi_k$ can be expressed in terms of $\pi_1 = \pi$ as

$$\pi_k = 1 - (1 - \pi)^k, \qquad (3.3)$$

where $\pi$ is the solution to (3.2). In Figure 2 $\pi_k$ is plotted as a function of $k$ for the cases
$\pi = 0.25$ and $\pi = 0.5$. It is seen that $\pi_k$ grows quickly up towards 1, implying that the
outbreak probability is close to 1 when initiated by many individuals as long as $R_0 > 1$
and $\tau_I$ is not very large (meaning that the infectious period is not extremely variable).

The distribution of the infectious period is hence mainly of interest when the epidemic
is initiated by rather few individuals.
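How quickly the outbreak probability approaches 1 as the number of initial infectives grows can be checked with a few lines of code. This is only an illustration of the relation $\pi_k = 1 - (1 - \pi)^k$ implied by the independence argument above; the function name and the values of $k$ are arbitrary choices.

```python
def outbreak_probability_k(pi, k):
    """Probability of a major outbreak with k initial infectives, given pi for k = 1."""
    # Each initial infective fails to start a major outbreak with probability
    # 1 - pi, independently of the others.
    return 1.0 - (1.0 - pi) ** k


if __name__ == "__main__":
    for pi in (0.25, 0.5):            # the two cases shown in Figure 2
        print(pi, [outbreak_probability_k(pi, k) for k in (1, 2, 5, 10, 20)])
```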

3.2 The initial growth rate of the epidemic 

We now study another property which is heavily influenced by both the latent and infec- 
tious periods, their mean durations as well as their randomness: the initial growth rate 
of the epidemic. As before we assume that the community size n is large. 

In Section 2 it was shown that an epidemic can only take off if $R_0 = \lambda\mu_I > 1$. Since we
now focus on the growth rate of the epidemic we assume this to be the case. As mentioned
before, the early stages of the epidemic in a large community can be approximated by a
branching process. Since $R_0 > 1$ the branching process is said to be super-critical, and
if the epidemic/branching process takes off, branching process theory (e.g. Jagers, 1975)
tells us that the epidemic will grow at an exponential rate during the initial phase. More
precisely, in case of a major outbreak, the number of infectious individuals at time $t$, $I(t)$,
will satisfy $I(t) \sim e^{\alpha t}$ for some $\alpha$. The parameter $\alpha$, called the Malthusian parameter, is
known to solve

$$\int_0^\infty e^{-\alpha t}\, \lambda\, P(L < t < L + I)\, dt = 1, \qquad (3.4)$$

where $L$ and $I$ are the (random) durations of the latent and infectious periods respectively.
In the present paper $L$ and $I$ are assumed to be independent and Gamma distributed with
means $\mu_L$ and $\mu_I$ and coefficients of variation $\tau_L$ and $\tau_I$ respectively. The solution $\alpha$ to
(3.4) can then be shown to solve

$$\alpha = \frac{R_0}{\mu_I} \left(1 + \alpha \tau_L^2 \mu_L\right)^{-1/\tau_L^2} \left(1 - \left(1 + \alpha \tau_I^2 \mu_I\right)^{-1/\tau_I^2}\right) \qquad (3.5)$$

(see the Appendix for details).

Figure 2: The probability $\pi_k$ of an outbreak when $k$ infectious individuals enter the
population.

It can be shown that the exponential growth rate $\alpha$ (i.e. the solution to (3.5)) depends
monotonically on all four parameters of the latent and infectious periods, $\mu_L$, $\mu_I$, $\tau_L$ and
$\tau_I$, keeping $R_0$ fixed. As for the mean infectious and latent periods, $\mu_I$ and $\mu_L$, the growth
rate is decreasing. This is not surprising: the longer the infectious period (keeping $R_0 = \lambda\mu_I$
fixed!) the slower the epidemic will grow, and the same applies to the situation where
the latent period becomes longer on average. Perhaps more surprising is that $\alpha$ depends
monotonically on the coefficients of variation $\tau_L$ and $\tau_I$, and in different ways! It can be
shown from (3.5) that the growth rate is increasing in $\tau_L$ but decreasing in $\tau_I$. In other
words, a more random latent period increases the growth rate whereas a more random
infectious period decreases it.

A heuristic motivation for the different monotone dependence on the coefficients of
variation goes as follows. Consider first two alternatives for the infectious period, assuming
for simplicity that there is no latency period: in the first scenario two infectious periods are
both two time-units long (corresponding to small $\tau_I$), and in the second scenario one infectious
period has length 1 and the other length 3, thus having the same mean $\mu_I$ but larger $\tau_I$. During
the first time-unit both scenarios will have two persons infecting, but during the second
time-unit the first scenario (small $\tau_I$) will still have two persons infecting whereas the second
scenario has only one. During the third time-unit the second scenario will "catch up"
in infecting new people by having one person infecting (as opposed to no one for the first
scenario), but the first scenario will clearly infect new individuals at an earlier stage in
time, thus resulting in a higher growth rate $\alpha$. This motivates why $\alpha$ is decreasing in $\tau_I$. The
motivation for the growth rate being increasing in $\tau_L$ is similar. Suppose that we have two
alternative scenarios similar to before: two latent periods of equal length two time-units,
or one of length 1 and one of length 3, and assume for simplicity that all infectious periods
last one time-unit in both scenarios. In the first scenario the two individuals will infect
others between time 2 and 3 whereas in the second scenario one person will infect between
time 1 and 2 and the other between 3 and 4. The second scenario (with higher $\tau_L$) will
have a higher growth rate because of the multiplicative effect the first person's infections
will cause: these people will start new epidemic outbreaks at an earlier stage.

In Figure 3 the exponential growth rate $\alpha$ is computed numerically for the case $R_0 = 2$,
this being a common value for diseases like influenza (e.g. Mills et al., 2004). In each of
the four sub-plots, one parameter is varied over an interval (1 to 14 days for the mean
durations $\mu_L$ and $\mu_I$, and 0 to 3 for the coefficients of variation $\tau_L$ and $\tau_I$) keeping the
remaining parameters constant. The means are set to 7 days and the coefficients of
variation to 3/7 (corresponding to a standard deviation of 3 days) when not varied.

From the figure it is clear that all four parameters $\mu_L$, $\mu_I$, $\tau_L$ and $\tau_I$ are quite influential
for the initial growth rate of the epidemic. As mentioned above, the growth rate is
decreasing in the two mean durations (recall that the expected accumulated infectivity
$R_0 = \lambda\mu_I$ is kept fixed, so when $\mu_I$ changes, so does $\lambda$). As for the coefficients of variation,
the growth rate $\alpha$ decreases with $\tau_I$ but increases with $\tau_L$.
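The numerical computation of the growth rate can be sketched as follows. The code assumes the Gamma form of the Malthusian equation derived in the Appendix, written as $\lambda\,\varphi_L(\alpha)\,(1 - \varphi_I(\alpha))/\alpha = 1$ with $\varphi(\alpha) = (1 + \alpha\tau^2\mu)^{-1/\tau^2}$, and solves it by bisection. The function names and the bracket are choices made here; the paper does not state which numerical method was used for Figure 3.

```python
import math


def log_laplace_gamma(alpha, mu, tau):
    """log E[exp(-alpha * X)] for a Gamma variable X with mean mu and CV tau."""
    if tau == 0:
        return -alpha * mu                        # fixed period of length mu
    v = tau**2
    return -math.log1p(alpha * v * mu) / v


def growth_rate(r0, mu_L, tau_L, mu_I, tau_I, tol=1e-12):
    """Malthusian parameter alpha > 0 for given R0 and period parameters."""
    if r0 <= 1.0:
        raise ValueError("exponential growth requires R0 > 1")
    lam = r0 / mu_I                               # contact rate implied by R0

    def g(alpha):
        # lam * phi_L(alpha) * (1 - phi_I(alpha)) / alpha; this decreases from
        # R0 (as alpha -> 0) towards 0, and alpha is the point where it equals 1.
        phi_L = math.exp(log_laplace_gamma(alpha, mu_L, tau_L))
        one_minus_phi_I = -math.expm1(log_laplace_gamma(alpha, mu_I, tau_I))
        return lam * phi_L * one_minus_phi_I / alpha

    lo, hi = 1e-12, lam                           # g(lo) is close to R0 > 1, g(lam) < 1
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) > 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    base = dict(r0=2.0, mu_L=7.0, tau_L=3 / 7, mu_I=7.0, tau_I=3 / 7)
    print("baseline:", growth_rate(**base))
    for name in ("tau_L", "tau_I"):
        for value in (0.0, 1.0, 3.0):
            print(name, value, growth_rate(**{**base, name: value}))
```

The baseline uses the parameter values that the text says are held fixed in Figure 3, so varying one argument at a time mimics one sub-plot of that figure.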

4 Practical relevance 

4.1 Estimating $R_0$ from growth rate needs prior knowledge

Our first, and perhaps most important, observation lies in the consequences of knowing
that the growth rate depends heavily on all of the parameters $\mu_L$, $\mu_I$, $\tau_L$ and $\tau_I$, and not
only on $R_0$. This implies that it is harder to estimate $R_0$ from only observing the early stages
of an epidemic, as we now illustrate.

Figure 3: Plots of the initial exponential (per day) growth rate $\alpha$ as function of the
model parameters. Parameters not varied are set to: $R_0 = 2$, $\mu_L = \mu_I = 7$, $\tau_L = \tau_I = 3/7$.

Recently, an important area in infectious disease epidemiology has been to analyse
emerging infectious diseases, for example SARS (e.g. McLean et al., 2005) and the fear
of a pandemic influenza (e.g. Ferguson et al., 2004). One important task when analysing
emerging infectious diseases is to estimate $R_0$ using data from the initial phase of the
epidemic. Such data sets typically consist of the number of diagnosed cases (per day
or per week) over a certain observation period, typically weeks or months. One can
argue that the number of diagnosed cases roughly corresponds to the number of recovered
individuals, and using branching process theory it can be shown that this number will have
the same growth rate as the number of infectives. If we let $R(t)$ denote the accumulated
number of removed individuals up to time $t$, it is known from branching process theory
that

$$R(t) \approx W e^{\alpha t}, \qquad (4.1)$$

where $W$ is a random variable, the same for all $t$, and $\alpha$ is the Malthusian parameter
treated in Section 3.2. If we look at the ratio of the number of removed individuals for
two different observation times it follows that $R(t_1)/R(t_0) \approx e^{\alpha(t_1 - t_0)}$, implying that we
can estimate the growth rate $\alpha$ by

$$\hat{\alpha} = \frac{\log(R(t_1)) - \log(R(t_0))}{t_1 - t_0}. \qquad (4.2)$$

The time-points $t_0 < t_1$ should be chosen such that the epidemic has really taken off at
$t_0$ and not too many should have been infected by $t_1$.
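The log-ratio estimator of the growth rate is simple to apply to a series of cumulative case counts. The sketch below uses made-up, noise-free data growing exactly like $5e^{0.05t}$ (not the WHO data), so the estimator should return 0.05 for any pair of time points; the function name and the data are hypothetical. The three intervals are the ones used for the SARS data later in this section.

```python
import math


def estimate_growth_rate(cumulative_cases, t0, t1):
    """Estimate alpha from cumulative counts at days t0 < t1 via the log-ratio."""
    count_start, count_end = cumulative_cases[t0], cumulative_cases[t1]
    return (math.log(count_end) - math.log(count_start)) / (t1 - t0)


# Synthetic (made-up) cumulative counts for days 0..30; not real outbreak data.
synthetic = [5.0 * math.exp(0.05 * t) for t in range(31)]

intervals = [(10, 20), (10, 25), (15, 25)]
estimates = [estimate_growth_rate(synthetic, t0, t1) for t0, t1 in intervals]
alpha_hat = sum(estimates) / len(estimates)       # average over the intervals

if __name__ == "__main__":
    print(estimates, alpha_hat)
```

With real data the individual estimates differ between intervals, which is why averaging several of them is one reasonable way to obtain a single value.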

The remaining problem lies in making conclusions about $R_0$ from the estimate $\hat{\alpha}$.
Equation (3.5) gives a one-to-one correspondence between $\alpha$ and $R_0$ when the model
parameters for the latent and infectious period are given. Rearranging Equation (3.5)
gives the following expression for $R_0$:

$$R_0 = \alpha \mu_I \, \frac{\left(1 + \alpha \tau_L^2 \mu_L\right)^{1/\tau_L^2}}{1 - \left(1 + \alpha \tau_I^2 \mu_I\right)^{-1/\tau_I^2}}. \qquad (4.3)$$

However, for emerging infectious diseases the parameters of the infectious and latent
periods are rarely known. The best one can hope for are some crude estimates. This will
induce uncertainty in the estimate for $R_0$ no matter how precise the estimator $\hat{\alpha}$ is.

We now illustrate this using WHO data from the SARS outbreak (WHO webpage). 
Our model is of course unrealistic for this outbreak in several aspects as we are neglecting 
other community heterogeneities. However, the same qualitative conclusions would hold 
also for more realistic models. In Figure 4 part of a large outbreak of SARS in China is
illustrated. It shows the incidence and accumulated number of diagnosed SARS cases by 
the day, between April and June in 2003. 




Figure 4: SARS outbreak in China 2003.04.02 - 2003.06.02. Data from WHO.



From this data we estimate the growth rate $\alpha$ using (4.2). Rather than estimating
$\alpha$ from one time interval $(t_0, t_1)$ we take several, thus getting several $\alpha$-estimates. We
then take the mean of these estimates as our final estimate. More precisely we took the
intervals $(t_0, t_1) = (10, 20)$, $(10, 25)$, and $(15, 25)$, all three representing the early stages
of the epidemic, neglecting the very first bit and stopping before the speed really starts
dropping. The resulting $\alpha$-estimates were $\hat{\alpha}_1 = 0.071$, $\hat{\alpha}_2 = 0.054$, and $\hat{\alpha}_3 = 0.034$. We
take the mean of these values as our final estimate: $\hat{\alpha} = 0.0530$. (We will use the estimate
to illustrate that a range of $R_0$-values are consistent with this estimate; the exact value of
$\hat{\alpha}$ is of secondary importance.)

Given the estimate ($\hat{\alpha} = 0.0530$) we now use Equation (3.5) to see what we can say
about $R_0$. The disappointing answer is that, unless we assume some prior knowledge
about the latent and infectious periods, we can hardly say anything about $R_0$, except
that $R_0 > 1$ since the epidemic is taking off. In order to say more about $R_0$ one needs
either more detailed data or some other knowledge about the latent and infectious periods.
If infections are contact-traced it is possible to make inference on the generation times.
However, estimating model parameters from such inference is far from simple (Svensson,
2007). If such information is not available, $R_0$ can be estimated by assuming interval
ranges for each model parameter, ranges within which the true parameter values are
believed to lie. To illustrate this from the SARS data we choose the following intervals:
$\mu_L$ and $\mu_I$ are assumed to lie between 3 and 11 days (with 7 days as mid-point), and the
coefficients of variation are assumed to lie between 0 and 4/7 (corresponding to a standard
deviation of 4 days at the mid-point mean of 7 days). In Table 1 the $R_0$ estimate, based on
(3.5), $\hat{\alpha} = 0.053$ and current values of $\mu_L$, $\mu_I$, $\tau_L$ and $\tau_I$, is listed for each of the 16
combinations of interval end-points. The point estimate when each parameter takes on the
mid-value ($\mu_L = \mu_I = 7$ and $\tau_L = \tau_I = 2/7$ and $\hat{\alpha} = 0.053$) equals $\hat{R}_0 = 1.747$.

Table 1: Estimates of $R_0$ from the SARS outbreak in China 2003.04.02 - 2003.06.02 under
different assumptions of model parameters and $\hat{\alpha} = 0.053$. Data from WHO.

As can be seen from the table the estimate $\hat{R}_0$ depends quite a lot on our assumptions
about the latent and infectious periods. The smallest estimate is $\hat{R}_0 = 1.2897$, obtained
when $\mu_L$, $\mu_I$ and $\tau_I$ are at their minimal possible point and where $\tau_L$ is at its maximal
point. The largest estimate is $\hat{R}_0 = 2.5111$, obtained for the "opposite" parameter choices.
Within the range of "possible" parameter values for the latent and infectious periods, the
$R_0$ estimate hence changes by a factor 2. It is hence hard to make precise estimates of $R_0$
without other sources of information regarding the latent and infectious periods.

This illustrates that an estimate of $R_0$ using data from the initial growth is quite
uncertain except in the rare case that the parameters of the latent and infectious periods
are known with fairly high precision.
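The sensitivity of the $R_0$ estimate to the assumed period distributions can be explored with a short script. The sketch below evaluates the rearranged growth-rate relation, $R_0 = \alpha\mu_I\,(1 + \alpha\tau_L^2\mu_L)^{1/\tau_L^2} / (1 - (1 + \alpha\tau_I^2\mu_I)^{-1/\tau_I^2})$, at the mid-point and at the 16 combinations of interval end-points described above. The function name is hypothetical, and no attempt is made to reproduce Table 1 digit for digit, since not all of the table's inputs are recoverable from the text.

```python
import itertools
import math


def r0_from_growth_rate(alpha, mu_L, tau_L, mu_I, tau_I):
    """R0 implied by growth rate alpha for Gamma latent and infectious periods."""
    def log_laplace(mu, tau):
        # log E[exp(-alpha * X)] for a Gamma variable with mean mu and CV tau
        if tau == 0:
            return -alpha * mu
        v = tau**2
        return -math.log1p(alpha * v * mu) / v

    inverse_phi_L = math.exp(-log_laplace(mu_L, tau_L))
    one_minus_phi_I = -math.expm1(log_laplace(mu_I, tau_I))
    return alpha * mu_I * inverse_phi_L / one_minus_phi_I


alpha_hat = 0.053
mid_point = r0_from_growth_rate(alpha_hat, 7, 2 / 7, 7, 2 / 7)

# All 16 combinations of end-points, in the order (mu_L, tau_L, mu_I, tau_I).
grid = list(itertools.product((3, 11), (0, 4 / 7), (3, 11), (0, 4 / 7)))
values = [r0_from_growth_rate(alpha_hat, *combo) for combo in grid]

if __name__ == "__main__":
    print("mid-point:", round(mid_point, 3))
    print("range over end-points:", min(values), max(values))
```

Even in this simplified setting the largest value over the grid is roughly twice the smallest, which is the qualitative point made in the text.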

4.2 Estimating variability from final size data 

In Section 3.1 it was shown that, for fixed $R_0$, the more random the infectious period
is, the more unlikely is a large outbreak (the same conclusion holds when other factors,
e.g. susceptibilities and/or infectivities, are varied; Andersson and Britton, 2000a, and
references therein). This observation can be used to say something about the randomness
of the infectious period (and/or of individuals) from final size data, i.e. data lacking any
time measurements. If we observe the final proportion infected $\tilde{p}$ in a large outbreak we
estimate $R_0$ using Equation (2.1), which gives the estimate

$$\hat{R}_0 = \frac{-\ln(1 - \tilde{p})}{\tilde{p}}. \qquad (4.4)$$

The information about $\tau_I$ lies in the fact that a major outbreak took place, an event
with small probability when $\tau_I$ is large. In the Bayesian framework this can be illustrated
by comparing the prior distribution $p(\tau_I)$ with the posterior distribution $p(\tau_I \mid \tilde{p})$. Using
Bayes formula we get

$$p(\tau_I \mid \tilde{p}) \propto p(\tilde{p} \mid \tau_I)\, p(\tau_I),$$

i.e. the posterior distribution is proportional to the prior distribution multiplied by
$p(\tilde{p} \mid \tau_I)$, the probability of a large outbreak, denoted $\pi$ in Section 3.1. There it was shown
that $\pi = p(\tilde{p} \mid \tau_I)$ is decreasing in $\tau_I$, the coefficient of variation of the infectious period.
So, any prior knowledge about $\tau_I$ is shifted towards smaller values in the posterior
distribution. The same type of conclusion also applies to other individual heterogeneities:
the fact that a major outbreak has occurred shifts any prior knowledge about individual
variation towards less variation.

Of course, more detailed data containing time-measurements, or data from more than
one outbreak, are to be preferred. But, if no such data are available, any prior information
about the randomness of the infectious period is shifted towards smaller values of $\tau_I$ when
inference is based on final size data from one major outbreak. In Figure 5 we illustrate this
for the case that $\tau_I$ has an exponential distribution with mean 0.5 as prior distribution,
and where the posterior distribution is based on a major outbreak resulting in 50% getting
infected.

It is seen that the posterior distribution is not more "concentrated" compared with
the prior distribution, as is usually the case. Instead the posterior distribution is merely
shifted towards smaller values, implying that after observing an epidemic with
50% getting infected the posterior favors smaller values of $\tau_I$ (the coefficient
of variation of the infectious period) as compared to prior beliefs. The posterior mean
of $\tau_I$ is 0.384 (see the caption of Figure 5). The same qualitative conclusion, that a major
outbreak results in higher posterior belief for small coefficient of variation of the infectious
period, holds for any fraction getting infected and any prior distribution for $\tau_I$.

Figure 5: Exponential prior distribution function of $\tau_I$ with mean 0.5, and posterior
distribution of $\tau_I$ after observing an outbreak resulting in 50% getting infected, with
mean 0.384.
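The prior-to-posterior shift can be approximated numerically. The sketch below multiplies an exponential prior for $\tau_I$ by the outbreak probability $\pi(\tau_I)$, with $R_0$ estimated from the final proportion infected, and integrates with the trapezoidal rule. The outbreak-probability relation $1 - \pi = (1 + R_0\tau_I^2\pi)^{-1/\tau_I^2}$, the truncation of the integral at $\tau_I = 10$, the grid size and the function names are assumptions made here; whether the result matches the value in the caption of Figure 5 depends on numerical details not given in the text.

```python
import math


def outbreak_probability(r0, tau_I, tol=1e-12):
    """Largest solution pi of 1 - pi = (1 + r0*tau_I^2*pi)^(-1/tau_I^2)."""
    if r0 <= 1.0:
        return 0.0

    def f(pi):
        if tau_I == 0:
            return 1.0 - pi - math.exp(-r0 * pi)
        v = tau_I**2
        return 1.0 - pi - math.exp(-math.log1p(r0 * v * pi) / v)

    lo, hi = 1e-9, 1.0
    if f(lo) <= 0:
        return 0.0                                # root below numerical resolution
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def posterior_mean_tau(p_obs, prior_mean=0.5, upper=10.0, steps=2000):
    """Posterior mean of tau_I given a major outbreak with final fraction p_obs."""
    r0 = -math.log(1.0 - p_obs) / p_obs           # R0 estimated from the final size
    h = upper / steps
    numerator = denominator = 0.0
    for i in range(steps + 1):
        tau = i * h
        weight = 0.5 if i in (0, steps) else 1.0  # trapezoidal rule weights
        prior = math.exp(-tau / prior_mean) / prior_mean
        unnormalised = weight * outbreak_probability(r0, tau) * prior
        numerator += tau * unnormalised
        denominator += unnormalised
    return numerator / denominator


if __name__ == "__main__":
    print(posterior_mean_tau(0.5))
```

Because $\pi(\tau_I)$ is decreasing, the posterior mean computed this way is always below the prior mean, whatever prior is plugged in.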

5 Discussion 

In the present paper we have tried to motivate the use of stochastic models when studying 
certain features in epidemics. First it was illustrated that the probability for a major 
outbreak is greatly affected by the randomness of the infectious period, or more generally, 
the randomness of the "infectivity" exerted by an individual. The more variation the 
distribution of the infectious period contains, the less likely is a major outbreak. As a 
consequence, observed epidemics (major outbreaks) will tend to originate from diseases 
with infectious periods not having very skew/heavy-tailed distributions. It was also shown 
that the probability of a major outbreak is unaffected by a latency period of arbitrary 
length. The latter result relies on the assumption that individuals do not change behaviour
as the epidemic progresses and that no preventive measures are put into place; otherwise the
latency period will have an effect.

The second feature studied was the initial exponential growth rate. This rate was
shown to depend heavily on both the latent and infectious periods, their means as well
as their randomness. From a practical perspective this implies that, unless additional
information about the infectious period and latency period distributions is available, it is
very hard to estimate the basic reproduction number $R_0$ (and effects of possible preventive
measures) from the exponential growth rate of the initial outbreak phase.

There are also other features in epidemics affected by randomness and not only mean
values. Common for most of these situations is that, for some type of event, only few
random objects are influential. One such feature is the time to disease extinction of
endemic diseases: before disease extinction only few are infectious. For example, Andersson
and Britton (2000b) show that not only the means but also the coefficients of variation of
the latency period, infectious period and life-duration affect the time to extinction when
starting at the endemic level.

Another feature affected by randomness is vaccine response. Two models for vaccine
response are the leaky model and the all-or-nothing model (Halloran et al., 1992). The
leaky model assumes that each person vaccinated has a susceptibility that is reduced
by a factor $e$ (for efficacy). The all-or-nothing model instead assumes that a proportion
$e$ are completely immune whereas the remaining proportion vaccinated are unaffected
by the vaccine. Here too, $e$ is called efficacy. In both cases, the relative risk that a
vaccinated person gets infected by an infectious contact is $1 - e$ (so the person avoids
infection due to the vaccine with probability $e$). Even though the two models have the
same "efficacy" their effect is different. In fact, a leaky vaccine always reduces the spread
less than an all-or-nothing vaccine with the same efficacy, so the randomness in vaccine
effect matters. A simple explanation for this is the following (Ball and Becker, 2006).
Both vaccine models have the same probability of infection ($1 - e$) at the first contact
with an individual. However, among the vaccinated people who escape infection upon
the first contact, people vaccinated with a leaky vaccine still have relative susceptibility
$1 - e$ whereas those with the all-or-nothing vaccine escaping infection the first time all
have the "all"-effect and are hence completely immune. As a consequence, the final size
in case of a major outbreak will be smaller with an all-or-nothing vaccine as compared
to a leaky vaccine having the same efficacy $e$ ($0 < e < 1$). Still, both vaccine responses
have the same critical vaccination coverage $v_c = e^{-1}(1 - 1/R_0)$, meaning that the same
fraction has to be vaccinated with either vaccine in order to obtain herd immunity.
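The difference between the two vaccine responses can be illustrated with deterministic final-size relations. The sketch below is an assumption-laden illustration rather than the analysis of Ball and Becker (2006): it assumes that the vaccine affects susceptibility only, that a fraction `coverage` is vaccinated before the outbreak, and that the fraction $z$ of the whole community infected solves $z = (1 - v)(1 - e^{-R_0 z}) + v\,q(z)$, where $q(z) = 1 - e^{-R_0(1 - e)z}$ for the leaky model and $q(z) = (1 - e)(1 - e^{-R_0 z})$ for the all-or-nothing model. The function name and the parameter values are hypothetical.

```python
import math


def final_fraction_infected(r0, coverage, efficacy, model, tol=1e-12):
    """Fraction of the whole community infected in a major outbreak.

    model is "leaky" or "all-or-nothing"; the vaccine is assumed to act on
    susceptibility only.
    """
    def rhs(z):
        infected_unvaccinated = 1.0 - math.exp(-r0 * z)
        if model == "leaky":
            # every vaccinated person has susceptibility 1 - efficacy
            infected_vaccinated = 1.0 - math.exp(-r0 * (1.0 - efficacy) * z)
        elif model == "all-or-nothing":
            # a fraction `efficacy` is fully immune, the rest fully susceptible
            infected_vaccinated = (1.0 - efficacy) * infected_unvaccinated
        else:
            raise ValueError(f"unknown model: {model}")
        return (1.0 - coverage) * infected_unvaccinated + coverage * infected_vaccinated

    # Iterating z <- rhs(z) from z = 1 decreases monotonically to the largest solution.
    z = 1.0
    for _ in range(10**6):
        z_new = rhs(z)
        if abs(z - z_new) < tol:
            break
        z = z_new
    return z_new


if __name__ == "__main__":
    for model in ("leaky", "all-or-nothing"):
        print(model, final_fraction_infected(3.0, 0.5, 0.6, model))
```

Under these assumptions both models have the same threshold, $R_0(1 - ve) = 1$, but above the threshold the leaky vaccine leaves a larger final size, in line with the argument in the text.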

As pointed out there are many other features less influenced by stochasticity, for 
example $R_0$. In the present paper we simply focus on aspects where stochasticity does
matter. 

Needless to say, the model we have studied is by no means fully realistic. Important 
extensions are for example to allow for different types of individuals having different 
susceptibility, infectivity and/or mixing patterns, e.g. households with higher contact 
rates within households (since households are small, stochasticity plays a role also here, cf.
Ball et al., 1997). However, the features considered in the present paper are still valid 
under such more realistic models. 

Appendix: The Malthusian parameter 

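The Malthusian parameter $\alpha$ is the solution to (3.4). To begin with,

$$P(L < t < L + I) = P(L \le t) - P(L + I \le t).$$

Hence (3.4) equals

$$1 = \lambda \int_0^\infty e^{-\alpha t} \left(P(L \le t) - P(L + I \le t)\right) dt = \frac{\lambda}{\alpha}\, \varphi_L(\alpha) \left(1 - \varphi_I(\alpha)\right). \qquad (A.1)$$

The second equality follows from partial integration and identifying the Laplace transforms
of the latent and infectious periods: $\varphi_L(\alpha) = E(e^{-\alpha L}) = \int_0^\infty e^{-\alpha s} f_L(s)\,ds$, and similarly for
the infectious period. The infectious period $I$ is gamma-distributed. Using first the more
common parametrization $I \sim \Gamma(\alpha_I, \beta_I)$ we get

$$\varphi_I(\alpha) = \left(\frac{\beta_I}{\beta_I + \alpha}\right)^{\alpha_I}. \qquad (A.2)$$

Since also the latent period is gamma distributed we also have that
$\varphi_L(\alpha) = \left(\beta_L/(\beta_L + \alpha)\right)^{\alpha_L}$. Thus, by (A.1) and (A.2), Equation (3.4) is simplified to

$$\alpha = \lambda \left(\frac{\beta_L}{\beta_L + \alpha}\right)^{\alpha_L} \left(1 - \left(\frac{\beta_I}{\beta_I + \alpha}\right)^{\alpha_I}\right). \qquad (A.3)$$

If we convert to the more interpretable parameters mean $\mu$ and coefficient of variation $\tau$
we have that $\mu_I = \alpha_I/\beta_I$ and $\tau_I = 1/\sqrt{\alpha_I}$, and similarly for the latent period. Equation
(A.3) can then, after some simple algebra and using that $R_0 = \lambda\mu_I$, be written as

$$\alpha = \frac{R_0}{\mu_I} \left(1 + \alpha \tau_L^2 \mu_L\right)^{-1/\tau_L^2} \left(1 - \left(1 + \alpha \tau_I^2 \mu_I\right)^{-1/\tau_I^2}\right).$$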